Proto-value functions (English Wikipedia)
In applied mathematics, proto-value functions (PVFs) are automatically learned basis functions that are useful in approximating task-specific value functions, providing a compact representation of the powers of transition matrices. They provide a novel framework for solving the credit assignment problem, and a novel approach to solving Markov decision processes (MDPs) and reinforcement learning (RL) problems using multiscale spectral and manifold learning methods. Proto-value functions are generated by spectral analysis of a graph, using the graph Laplacian.

Proto-value functions were first introduced in the context of reinforcement learning by Sridhar Mahadevan in his paper ''Proto-Value Functions: Developmental Reinforcement Learning'' at ICML 2005.〔Mahadevan, S. "Proto-Value Functions: Developmental Reinforcement Learning". Proceedings of the International Conference on Machine Learning (ICML), 2005〕

== Motivation ==

Value function approximation is a critical component of solving MDPs defined over a continuous state space. A good function approximator allows an RL agent to accurately represent the value of any state it has experienced without explicitly storing that value. Linear function approximation using basis functions is a common way of constructing a value function approximation; typical choices of basis include radial basis functions, polynomial state encodings, and CMACs. However, the parameters associated with these basis functions often require significant domain-specific hand-engineering.〔Johns, J. and Mahadevan, S. "Constructing Basis Functions from Directed Graphs for Value Function Approximation". International Conference on Machine Learning (ICML), 2007〕 Proto-value functions attempt to reduce this hand-engineering by accounting for the underlying manifold structure of the problem domain.
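The spectral construction described above can be illustrated with a minimal sketch. It is not taken from the cited papers: the 50-state chain graph, the use of the combinatorial Laplacian L = D - A (the normalized Laplacian is another common choice), the target function `v_true`, and all function names are assumptions chosen for illustration. The low-order eigenvectors of the Laplacian serve as the basis, and a linear value function approximation is then fitted by least squares.

```python
import numpy as np

def chain_adjacency(n):
    """Adjacency matrix of an undirected chain graph with n states (illustrative domain)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A

def proto_value_functions(A, k):
    """Return the k smallest eigenvalues and eigenvectors of L = D - A.

    Assumption: the combinatorial Laplacian is used here; the eigenvectors
    with the smallest eigenvalues are the smoothest functions on the graph.
    """
    D = np.diag(A.sum(axis=1))
    L = D - A
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvals[:k], eigvecs[:, :k]

def fit_value_function(Phi, v):
    """Least-squares fit of v in the span of the basis columns of Phi."""
    w, *_ = np.linalg.lstsq(Phi, v, rcond=None)
    return Phi @ w

n = 50
A = chain_adjacency(n)
eigvals, Phi = proto_value_functions(A, k=10)

# Hypothetical target: discounted value of reaching the last state of the chain.
gamma = 0.9
v_true = gamma ** (n - 1 - np.arange(n))

v_hat = fit_value_function(Phi, v_true)
print("approximation error (L2 norm):", np.linalg.norm(v_true - v_hat))
```

In this sketch the basis depends only on the graph connectivity, not on the reward, which is the sense in which the same basis could be reused across tasks defined on the same state space. Increasing `k` can only reduce the least-squares error, since the smaller bases are nested inside the larger ones.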
Excerpt source: Wikipedia, the free encyclopedia, article "Proto-value functions".